[Packaging] Optimize Dockerfile build #25184
Conversation
Thank you for your contribution hholst80! We will review the pull request and get back to you soon.
@microsoft-github-policy-service agree
Packaging
2 weeks of no activity. I am pretty sure I did not break anything with my changes.
Dockerfile
Outdated
COPY --from=builder /azure-cli/az.completion /root/.bashrc
COPY --from=builder /usr/local /usr/local
COPY --from=tools /usr/local/bin/jp /usr/local/bin/
I think this is ambiguous and hard to maintain.
Alternatively, we can modify the build script to install and delete the src in one step.
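In concrete terms, that one-step install-and-delete could look something like this with BuildKit, using a bind mount so the source never lands in a layer at all (a sketch; the paths follow this PR's Dockerfile, and it assumes the dockerfile:1.4 syntax directive for heredoc support):

# Requires: # syntax=docker/dockerfile:1.4 at the top of the Dockerfile
RUN --mount=target=/mnt <<EOF
cp -ar /mnt /azure-cli
cd /azure-cli
./scripts/install_full.sh
cd /
rm -rf /azure-cli
EOF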
I think that is strictly worse, but I see the point. We can move jp into the builder step, and then we have just two COPY commands that would have very little reason to change. We could also do this:
COPY --from=builder /usr/local /usr/local
COPY --from=tools /usr/local /usr/local
I do not think az.completion should be included in the environment to begin with, nor should jp, but here we are. I only want to make the existing solution smaller, without sacrificing readability or maintainability.
What I mean is that I do not think az.completion.sh should be in /root. The jp binary is a single static executable. If the user wants it inside az-cli, they can inject it at container creation or have it mounted. There is no need to bundle it with az unless az cli itself has a runtime dependency on it.
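For instance, mounting a host copy of jp at container creation might look like this (a sketch; the host path to a jp binary is an assumption):

# Mount a host jp binary read-only into the published image
docker run --rm -it \
    -v "$HOME/bin/jp:/usr/local/bin/jp:ro" \
    mcr.microsoft.com/azure-cli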
I think this line is not safe: COPY --from=builder /usr/local /usr/local. What if the installed package also modifies another folder? We can't guarantee that everything we need is in that folder. (Although it might be true.)
I prefer using a single RUN command to do the same thing, like the CPython dockerfile.
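For reference, the CPython-style single-RUN pattern being alluded to has roughly this shape (a condensed sketch under that assumption, not the actual CPython Dockerfile):

# Install build deps, build, and remove the deps in one layer
RUN apk add --no-cache --virtual .build-deps gcc make musl-dev libffi-dev \
    && ./scripts/install_full.sh \
    && apk del --no-cache .build-deps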
In this multi-stage solution, we do not carry over any potential residual components from the build process. That is very hard to achieve with RUN in a single stage: there is always junk left behind, not to mention that the result is hard to read and unmaintainable. There is also an observational bias behind the commonality of long-winded single-stage builds: nobody dares to change anything because they do not understand the consequences of their changes. That is not best practice, and not a place where we want to be.
At first, it can seem like a reasonable fear that a generic install script could install something outside the defined install location. To overcome that fear it is enough to open up the install script and see what it does: pip install, exclusively. While it would be possible for a setup.py to do something incredibly stupid here, the Python pip framework is reasonably opinionated about what can and should be done (too opinionated if you ask me). If any files are installed, they will be put in the target specified by the pip configuration.
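If anyone wants to double-check that claim, pip can list exactly where a distribution's files went (assuming azure-cli is the installed distribution name):

# Show the install location and every file pip placed there
pip show --files azure-cli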
If you feel strongly about the /usr/local copy, I might be able to fix that. But if you are against the multi-stage build, I cannot fix that: it is a requirement for a better, more maintainable Dockerfile with a smaller build artifact that can also be generated more quickly.
I'm not against multi-stage builds. If the CLI were a Node or Go project, I would definitely choose a multi-stage build; I could be sure that everything I need is in the dist folder or the binary.
The new dockerfile also removes bash, git and other packages. To maintain compatibility, we also need to figure out which files are changed during installation, which is more difficult than with a pip installation.
PS: I know we should not keep irrelevant packages in the image. In order to prevent issues like #24836, we still keep them for now.
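On the question of which files an installation changes, docker diff can enumerate that, for what it's worth (a sketch; a small pure-Python package stands in for the real install, and the container name is an assumption):

# Run an install in a throwaway container, then list what changed
docker run --name probe python:3.11-alpine pip install jmespath
docker diff probe   # A/C/D-prefixed list of added, changed, deleted paths
docker rm probe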
Let me know if you would prefer this kind of solution instead.
https://github.com/hholst80/azure-cli/tree/dockerfile-multistage-alt
Including the
the latest image has been reduced to 1GB
But still not enough. Can you confirm with the latest version 2.51.0 whether your PR can still reduce an extra 200MB? ====== There are some discussions in #19591; I added my comments and discoveries there. The reason why it is so big: for example, for one of the packages, we need to know how to install only the latest version.
In general, this tool is installed about 33 times; that's why it is so big.
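To see that duplication concretely, one rough check inside the image is to count the versioned API sub-packages (a sketch; the v20* directory pattern is an assumption based on how multi-API azure-mgmt SDKs are laid out):

# e.g. site-packages/azure/mgmt/network/v2015_06_15, v2016_09_01, ...
find /usr/local/lib/python*/site-packages/azure/mgmt -maxdepth 2 -type d -name 'v20*' | wc -l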
We previously trimmed Python SDKs for Windows and Linux packages: #23946. We will apply the same logic to docker images.
Thanks for the update. Please share the ticket or issue for the docker image so I can follow up (subscribe to it).
We're down to 485 MB now (compressed). If we keep this PR open long enough, we will eventually get it below 100 MB. ;-)
FYI, keeping all API versions costs 115 MB (again, compressed). If anyone can give a green light to remove the "useful tools" included (jq, curl, and jp), we can get the image size down to 438 MB. Actually, if we are very aggressive and remove git, perl, and such too, below 400 MB is possible. This will break workflows (I was myself using curl inside the container), and I am not sure that the break is worth the small extra saving in size. It is however odd that the container exposed this tooling to begin with, because now there are users relying on it. Including myself...
🔄 AzureCLI-FullTest
✔️ AzureCLI-BreakingChangeTest
Do we need all of these installed? Are these just needed for building, or do we shell out to them?

# We don't use openssl (3.0) for now. We only install it so that users can use it.
# libintl and icu-libs - required by azure devops artifact (az extension add --name azure-devops)
RUN apk add --no-cache \
    bash bash-completion openssh-keygen ca-certificates openssl \
    curl jq git perl zip \
    libintl icu-libs libc6-compat \
    && update-ca-certificates

The
@hholst80 - Can you look into using a Chiselled image? https://ubuntu.com/blog/benefits-chiselled-ubuntu-images-aspnet-eshop-demo
I think it is out of scope for this PR to remove or add to the existing tooling. Doing so will break between zero, one, and a million users' workflows, so it requires some sensitivity, cost-benefit analysis, etc. Removing curl broke my own workflows, so it is a real problem, not just a theoretical issue. This PR is taking forever to get across, and it's getting better and better. But it will not provide much value until it is merged.
@jiasli - I'm wondering if we have any CLI perf tests to see if removing the pycache impacts runtime perf.
Compared to the overhead of starting the container, I'd assume it's negligible. Actually, the total time could be larger, since the image layers need to be unpacked and copied to storage. It could be cheaper to just build the byte code on the fly.

Edit: validation of assumptions. I benchmarked with and without the overhead of starting the docker container itself, first running lightweight commands and then something more heavy.
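The shape of the comparison was roughly this (a sketch; the local image tag and the choice of command are assumptions):

# Command cost alone, measured inside the container
docker run --rm azure-cli:local bash -c 'time az --version'
# Command cost plus container startup, measured from the host
time docker run --rm azure-cli:local az --version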
Here I realized that the __pycache__ removal was too broad:

diff --git a/Dockerfile b/Dockerfile
index 794af5e3b..1c85845cf 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -33,7 +33,7 @@ COPY . /azure-cli
RUN ./scripts/install_full.sh
RUN python3 ./scripts/trim_sdk.py
RUN /usr/local/bin/az && bash -c "source /usr/local/bin/az.completion.sh"
-RUN find /usr/local -name __pycache__ | xargs rm -rf
+RUN find /usr/local/lib/python*/site-packages/azure -name __pycache__ | xargs rm -rf
# Install jp tool
RUN <<HERE

Now the timings are much more comparable, but the size increased just slightly. The delta is likely due to dependencies on the azure package outside its namespace.
Dockerfile
Outdated
RUN arch=$(arch | sed s/aarch64/arm64/ | sed s/x86_64/amd64/) && curl -L https://github.com/jmespath/jp/releases/download/${JP_VERSION}/jp-linux-$arch -o /usr/local/bin/jp \
    && chmod +x /usr/local/bin/jp
RUN --mount=from=builder,target=/mnt cd /mnt/usr/local && cp -ru . /usr/local/ |
I don't grasp why this is needed. What's the scenario where mounting produces a benefit?
Force-pushed from c2581bf to a4653dd
This was an oversight in a merge from my side. It is included now, with support for the same architectures as in dev batch.
Use a multi-stage build to trim off significant overweight:

* Utilize a multi-stage build
* Run ./scripts/trim_sdk.py
* Clean up __pycache__
* Optimize layer merge with dockerfile:1.2 syntax

Thanks to @jongio for suggesting the awesome dive tool to find the inefficiency in the layer merge (usage sketch below).

Additional cleanups to improve maintainability of the Dockerfile:

* Remove useless scanelf statement
* Removed usage of dos2unix as it is a concern of the checkout
* Introduce modern dockerfile:1.4 syntax
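For anyone following along, dive is used roughly like this (a sketch; the local tag is an assumption):

# Build the image, then inspect per-layer contents and wasted space
docker build -t azure-cli:local .
dive azure-cli:local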
@bebound, what are your final thoughts on the multi-stage build? Should we keep it as-is? (I reduced it to a single build stage and merged the 'tooling' stage with that.) We can reduce it down to a single build stage and remove the .build-deps, but I think the result is uglier. Also, for some reason this build is 1MB larger than the multi-stage one. Gods know why. I am OK to drop the multi-stage if you think this PR is better without it.

# syntax=docker/dockerfile:1.4
#---------------------------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for license information.
#---------------------------------------------------------------------------------------------
ARG PYTHON_VERSION="3.11"
FROM python:${PYTHON_VERSION}-alpine
ARG JP_VERSION="0.2.1"
ARG CLI_VERSION
# Metadata as defined at http://label-schema.org
ARG BUILD_DATE
LABEL maintainer="Microsoft" \
org.label-schema.schema-version="1.0" \
org.label-schema.vendor="Microsoft" \
org.label-schema.name="Azure CLI" \
org.label-schema.version=$CLI_VERSION \
org.label-schema.license="MIT" \
org.label-schema.description="The Azure CLI is used for all Resource Manager deployments in Azure." \
org.label-schema.url="https://docs.microsoft.com/cli/azure/overview" \
org.label-schema.usage="https://docs.microsoft.com/cli/azure/install-az-cli2#docker" \
org.label-schema.build-date=$BUILD_DATE \
org.label-schema.vcs-url="https://github.com/Azure/azure-cli.git" \
org.label-schema.docker.cmd="docker run -v \${HOME}/.azure:/root/.azure -it mcr.microsoft.com/azure-cli:$CLI_VERSION"
# We don't use openssl (3.0) for now. We only install it so that users can use it.
# libintl and icu-libs - required by azure devops artifact (az extension add --name azure-devops)
RUN apk add --no-cache \
bash bash-completion openssh-keygen ca-certificates openssl \
curl jq git perl zip \
libintl icu-libs libc6-compat \
&& update-ca-certificates
RUN --mount=target=/mnt --mount=type=cache,target=/root/.cache --mount=type=tmpfs,target=/root/.azure <<HERE
apk add --no-cache --virtual .build-deps gcc make openssl-dev libffi-dev musl-dev linux-headers
mkdir /azure-cli
cd /mnt
cp -ar . /azure-cli
cd /azure-cli
./scripts/install_full.sh
python3 ./scripts/trim_sdk.py
cd /
rm -rf /azure-cli
/usr/local/bin/az && bash -c "source /usr/local/bin/az.completion.sh"
ln -s /usr/local/bin/az.completion.sh /etc/profile.d/
find /usr/local/lib/python*/site-packages/azure -name __pycache__ | xargs rm -rf
arch=$(arch | sed s/aarch64/arm64/ | sed s/x86_64/amd64/)
curl -L https://github.com/jmespath/jp/releases/download/${JP_VERSION}/jp-linux-${arch} -o /usr/local/bin/jp
chmod +x /usr/local/bin/jp
apk del --no-cache .build-deps
HERE
ENV AZ_INSTALLER=DOCKER
CMD bash
Not sure if this change has been implemented; the image size is already reduced in the latest versions (the images built after 14th Nov).
But it is still 200MB+ more than the image I build with this PR.
The two enhancements mentioned in the previous comment were merged in 2.54, and the image size has dropped to 700MB. There is no plan to keep only the py files or only the pyc files for now.
That was not what my benchmark tests showed, though. There was no impact on
@hholst80 I highly doubt your test result that compiling the bytecode each time does not slow down the execution. Let me clarify my point.
In conclusion, I prefer keeping the pyc files.
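A quick way to reproduce the effect being described (a sketch; assumes a container with azure-cli installed, with __pycache__ removed the same way this PR does):

# Warm: bytecode already cached on disk
time python3 -c "import azure.cli.core"
# Cold: remove cached bytecode; -B keeps Python from rewriting it
find /usr/local/lib/python*/site-packages/azure -name __pycache__ | xargs rm -rf
time python3 -B -c "import azure.cli.core"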
I would too if I saw that kind of result.
Here is the code:
OK. So the raw az command without any options seems to take considerable additional time. There are some 5.5 million lines of Python code, so if a significant part of that is pulled in, it would of course have a significant impact on the execution time.
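Python can attribute that startup cost directly, for what it's worth (a sketch, run inside the image; the final line shows the cumulative time for the top-level import):

# Per-module import cost is written to stderr in microseconds
python3 -X importtime -c "import azure.cli.core" 2>&1 | tail -n 5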
@bebound We can close this then? Perhaps a refactor of the Dockerfile could be interesting at a later time, but the primary purpose was to reduce image size. This has been achieved.
This PR makes the built Docker artifact smaller by introducing a multi-stage build in the Dockerfile, trimming ~250MB from the exported image.
Update: The image size is now down to 444MB.
This checklist is used to make sure that common guidelines for a pull request are followed.
The PR title and description have followed the guidelines in Submitting Pull Requests.
I adhere to the Command Guidelines.
I adhere to the Error Handling Guidelines.